6 research outputs found
Nowhere to Hide: Cross-modal Identity Leakage between Biometrics and Devices
Along with the benefits of the Internet of Things (IoT) come potential privacy risks, since billions of connected devices are granted permission to track information about their users and communicate it to other parties over the Internet. Of particular interest to an adversary is the user's identity, which plays an important role in launching attacks. While the exposure of a single type of physical biometric or device identity has been extensively studied, the compound effect of leakage from both sides remains unknown in multi-modal sensing environments. In this work, we explore the feasibility of compound identity leakage across cyber-physical spaces and reveal that co-located smart device IDs (e.g., smartphone MAC addresses) and physical biometrics (e.g., facial/vocal samples) are side channels to each other. We demonstrate that our method is robust to various kinds of observation noise in the wild and that an attacker can comprehensively profile victims across multiple dimensions with nearly zero analysis effort. Two real-world experiments on different biometrics and device IDs show that the presented approach can compromise more than 70% of device IDs while harvesting multiple biometric clusters with ~94% purity.
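To make the cross-modal linking idea concrete, the following is a minimal illustrative sketch (not the paper's actual algorithm) of how co-located observations could tie device IDs to biometric clusters by simple co-occurrence counting; the `sessions` data layout and the function name are assumptions made for illustration.

```python
# Illustrative sketch only: links device IDs to biometric clusters by counting
# how often they are observed in the same session. Not the paper's method;
# the data layout and names are assumptions for illustration.
from collections import defaultdict

def link_ids_to_clusters(sessions):
    """sessions: iterable of (device_ids, biometric_cluster_ids) pairs,
    each describing one co-located observation window."""
    cooccur = defaultdict(lambda: defaultdict(int))
    for device_ids, cluster_ids in sessions:
        for dev in device_ids:
            for cl in cluster_ids:
                cooccur[dev][cl] += 1
    # Assign each device ID to the biometric cluster it co-occurs with most often.
    return {dev: max(counts, key=counts.get) for dev, counts in cooccur.items()}

# Example: MAC "aa:bb" is observed twice alongside biometric cluster 0.
sessions = [
    ({"aa:bb"}, {0}),
    ({"aa:bb", "cc:dd"}, {0, 1}),
    ({"cc:dd"}, {1}),
]
print(link_ids_to_clusters(sessions))  # {'aa:bb': 0, 'cc:dd': 1}
```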
Autonomous Learning of Speaker Identity and WiFi Geofence From Noisy Sensor Data
A fundamental building block towards intelligent environments is the ability to understand who is present in a certain area. A ubiquitous way of detecting this is to exploit the unique vocal characteristics of people as they interact with one another in common spaces. However, manually enrolling users into a biometric database is time-consuming and not robust to vocal deviations over time. Instead, consider audio features sampled during a meeting, yielding a noisy set of possible voiceprints. Given a number of meetings and knowledge of participation, e.g., sniffed wireless Media Access Control (MAC) addresses, can we learn to associate a specific identity with a particular voiceprint? To address this problem, this paper advocates an Internet of Things (IoT) solution and proposes to use co-located WiFi as a source of weak supervisory labels to automatically bootstrap the labelling process. In particular, a novel cross-modality labelling algorithm is proposed that jointly optimises the clustering and association process, solving the inherent mismatching issues arising from heterogeneous sensor data. At the same time, we propose to reuse the labelled data to iteratively update wireless geofence models and curate device-specific thresholds. Extensive experimental results from two different scenarios demonstrate that the proposed method achieves a 2-fold improvement in labelling compared with conventional methods and can deliver reliable speaker recognition in the wild.
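One plausible reading of the cluster-and-associate step is sketched below: voice embeddings are clustered and each cluster is matched to a sniffed MAC address by maximising cluster/MAC co-occurrence across meetings with a one-to-one assignment. This is a simplified interpretation, not the paper's algorithm; the data format, helper names, and the choice of KMeans plus Hungarian assignment are assumptions.

```python
# Illustrative sketch, not the paper's algorithm: cluster voice embeddings,
# then associate clusters with sniffed MAC addresses by maximising the
# cluster/MAC co-occurrence across meetings under a one-to-one assignment.
import numpy as np
from sklearn.cluster import KMeans
from scipy.optimize import linear_sum_assignment

def associate(embeddings, meeting_of_sample, macs_per_meeting, n_speakers):
    """embeddings: (N, D) voiceprints; meeting_of_sample: length-N meeting ids;
    macs_per_meeting: dict meeting id -> set of sniffed MACs (weak labels)."""
    labels = KMeans(n_clusters=n_speakers, n_init=10).fit_predict(embeddings)
    macs = sorted({m for s in macs_per_meeting.values() for m in s})
    cooccur = np.zeros((n_speakers, len(macs)))
    for lab, meeting in zip(labels, meeting_of_sample):
        for j, mac in enumerate(macs):
            if mac in macs_per_meeting[meeting]:
                cooccur[lab, j] += 1
    # Hungarian assignment on the negated matrix maximises total co-occurrence.
    rows, cols = linear_sum_assignment(-cooccur)
    return {int(r): macs[c] for r, c in zip(rows, cols)}, labels
```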
MatrixCity: A Large-scale City Dataset for City-scale Neural Rendering and Beyond
Neural radiance fields (NeRF) and their subsequent variants have led to remarkable progress in neural rendering. While most recent neural rendering work focuses on objects and small-scale scenes, developing neural rendering methods for city-scale scenes holds great potential for many real-world applications. However, this line of research is impeded by the absence of a comprehensive and high-quality dataset, and collecting such a dataset over real city-scale scenes is costly, sensitive, and technically difficult. To this end, we build a large-scale, comprehensive, and high-quality synthetic dataset for city-scale neural rendering research. Leveraging the Unreal Engine 5 City Sample project, we develop a pipeline to easily collect aerial and street city views, accompanied by ground-truth camera poses and a range of additional data modalities. Flexible control over environmental factors such as lighting, weather, and human and car crowds is also available in our pipeline, supporting the needs of various tasks covering city-scale neural rendering and beyond. The resulting pilot dataset, MatrixCity, contains 67k aerial images and 452k street images from two city maps of total size . On top of MatrixCity, a thorough benchmark is also conducted, which not only reveals unique challenges of the task of city-scale neural rendering but also highlights potential improvements for future work. The dataset and code will be publicly available at our project page: https://city-super.github.io/matrixcity/.
Comment: Accepted to ICCV 2023. Project page: https://city-super.github.io/matrixcity/
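Since the abstract states that every image ships with a ground-truth camera pose, a minimal loader sketch is given below; the file layout (a transforms.json holding 4x4 camera-to-world matrices, as is common for NeRF-style datasets) and all names are assumptions rather than MatrixCity's actual format.

```python
# Hypothetical loader sketch for a NeRF-style city dataset: the abstract only
# states that images come with ground-truth camera poses, so the file layout
# (transforms.json with 4x4 camera-to-world matrices) is an assumption here.
import json
from pathlib import Path
import numpy as np

def load_frames(root):
    root = Path(root)
    meta = json.loads((root / "transforms.json").read_text())
    frames = []
    for f in meta["frames"]:
        pose = np.array(f["transform_matrix"], dtype=np.float32)  # 4x4 cam-to-world
        frames.append({"image_path": root / f["file_path"], "pose": pose})
    return frames
```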
OmniCity: Omnipotent City Understanding with Multi-level and Multi-view Images
This paper presents OmniCity, a new dataset for omnipotent city understanding from multi-level and multi-view images. More precisely, OmniCity contains multi-view satellite images as well as street-level panorama and mono-view images, constituting over 100K pixel-wise annotated images that are well aligned and collected from 25K geo-locations in New York City. To alleviate the substantial pixel-wise annotation effort, we propose an efficient street-view image annotation pipeline that leverages the existing label maps of the satellite view and the transformation relations between different views (satellite, panorama, and mono-view). With the new OmniCity dataset, we provide benchmarks for a variety of tasks including building footprint extraction, height estimation, and building plane/instance/fine-grained segmentation. Compared with existing multi-level and multi-view benchmarks, OmniCity contains a larger number of images with richer annotation types and more views, provides more benchmark results of state-of-the-art models, and introduces a novel task of fine-grained building instance segmentation on street-level panorama images. Moreover, OmniCity provides new problem settings for existing tasks, such as cross-view image matching, synthesis, segmentation, detection, etc., and facilitates the development of new methods for large-scale city understanding, reconstruction, and simulation. The OmniCity dataset as well as the benchmarks will be available at https://city-super.github.io/omnicity.
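As a rough illustration of how a satellite label map can inform street-view annotation, the sketch below maps a geo-referenced point (e.g., a footprint corner) to a column of a street-level equirectangular panorama via its compass bearing from the camera location. This is a simplified geometric sketch, not OmniCity's annotation pipeline; the north-aligned panorama assumption and all names are introduced here for illustration.

```python
# Minimal sketch (not OmniCity's actual pipeline): map a geo-referenced point
# from a satellite label map to a column of an equirectangular street panorama
# by computing its compass bearing from the panorama location. The panorama is
# assumed to be north-aligned; projection details are deliberately simplified.
import math

def bearing_deg(lat1, lon1, lat2, lon2):
    """Initial bearing from (lat1, lon1) to (lat2, lon2), degrees clockwise from north."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dlon = math.radians(lon2 - lon1)
    y = math.sin(dlon) * math.cos(phi2)
    x = math.cos(phi1) * math.sin(phi2) - math.sin(phi1) * math.cos(phi2) * math.cos(dlon)
    return (math.degrees(math.atan2(y, x)) + 360.0) % 360.0

def panorama_column(cam_latlon, point_latlon, pano_width):
    """Column index in a north-aligned equirectangular panorama for a map point."""
    b = bearing_deg(*cam_latlon, *point_latlon)
    return int(round(b / 360.0 * pano_width)) % pano_width
```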
BungeeNeRF: Progressive Neural Radiance Field for Extreme Multi-scale Scene Rendering
Neural radiance fields (NeRF) have achieved outstanding performance in modeling 3D objects and controlled scenes, usually at a single scale. In this work, we focus on multi-scale cases where large changes in imagery are observed at drastically different scales. This scenario commonly arises in real-world 3D environments such as city scenes, with views ranging from the satellite level, which captures the overview of a city, to ground-level imagery showing the complex details of a building; it can also be found in landscapes and delicate Minecraft 3D models. The wide span of viewing positions within these scenes yields multi-scale renderings with very different levels of detail, which poses great challenges to neural radiance fields and biases them towards compromised results. To address these issues, we introduce BungeeNeRF, a progressive neural radiance field that achieves level-of-detail rendering across drastically varied scales. Starting from fitting distant views with a shallow base block, new blocks are appended as training progresses to accommodate the emerging details in the increasingly closer views. The strategy progressively activates high-frequency channels in NeRF's positional encoding inputs and successively unfolds more complex details as training proceeds. We demonstrate the superiority of BungeeNeRF in modeling diverse multi-scale scenes with drastically varying views on multiple data sources (city models, synthetic data, and drone-captured data) and its support for high-quality rendering at different levels of detail.
Comment: Accepted to ECCV 2022; previous version: CityNeRF: Building NeRF at City Scale; project page: https://city-super.github.io/cityner
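The progressive activation of high-frequency positional-encoding channels can be illustrated with the short sketch below: later training stages unmask more sin/cos frequency bands, letting closer views be fit with finer detail. The hard masking schedule and function names are assumptions for illustration, not BungeeNeRF's exact formulation.

```python
# Sketch of progressively activating high-frequency positional-encoding bands:
# later training stages unmask more sin/cos frequency channels. The scheduling
# details here are assumptions, not BungeeNeRF's exact formulation.
import numpy as np

def positional_encoding(x, num_freqs, active_freqs):
    """x: (N, 3) points; returns (N, 3 * 2 * num_freqs) features with
    frequency bands above `active_freqs` zeroed out (masked)."""
    feats = []
    for k in range(num_freqs):
        gate = 1.0 if k < active_freqs else 0.0  # hard mask; could be a soft ramp
        feats.append(gate * np.sin((2.0 ** k) * np.pi * x))
        feats.append(gate * np.cos((2.0 ** k) * np.pi * x))
    return np.concatenate(feats, axis=-1)

# Example schedule: each progressive stage activates more frequency bands.
points = np.random.rand(4, 3)
for stage, active in enumerate([4, 6, 8, 10], start=1):
    enc = positional_encoding(points, num_freqs=10, active_freqs=active)
    print(f"stage {stage}: {active} of 10 frequency bands active, dim = {enc.shape[-1]}")
```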